Turkish Discourse Bank: Porting a discourse annotation style to a morphologically rich language
Abstract
This paper describes the current state of the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It describes the annotation methods and the challenges posed by annotating Turkish, a free word order language with rich morphology. It shows the usefulness of PDTB-style annotation but points out the need to extend this annotation style to meet the needs of the target language.
Similar resources
The Annotation Scheme of the Turkish Discourse Bank and an Evaluation of Inconsistent Annotations
In this paper, we report on the annotation procedures we developed for annotating the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse Tree Bank (PDTB) annotation style by using it for annotating Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed, defining which parts of the scheme are an exten...
Cross-Domain and Cross-Language Porting of Shallow Parsing
English was the main focus of attention of the Natural Language Processing (NLP) community for years. As a result, there are significantly more annotated linguistic resources in English than in any other language. Consequently, data-driven tools for automatic text or speech processing are developed mainly for English. Developing similar corpora and tools for other languages is an important issu...
PDTB-style Discourse Annotation of Chinese Text
We describe a discourse annotation scheme for Chinese and report on the preliminary results. Our scheme, inspired by the Penn Discourse TreeBank (PDTB), adopts the lexically grounded approach; at the same time, it makes adaptations based on the linguistic and statistical characteristics of Chinese text. Annotation results show that these adaptations work well in practice. Our scheme, taken toge...
Annotating Subordinators in the Turkish Discourse Bank
In this paper we explain how we annotated subordinators in the Turkish Discourse Bank (TDB), an effort that started in 2007 and is still continuing. We introduce the project and describe some of the issues that were important in annotating three subordinators, namely karşın, rağmen and halde, all of which encode the coherence relation Contrast-Concession. We also describe the annotation tool.
TDB 1.1: Extensions on Turkish Discourse Bank
In this paper we present the recent developments on Turkish Discourse Bank (TDB). We first summarize the resource and present an evaluation. Then, we describe TDB 1.1, i.e. enrichments on 10% of the corpus (namely, added senses for explicit discourse connectives and new annotations for implicit relations, entity relations and alternative lexicalizations). We explain the method of annotation and
Journal title:
- D&D
Volume 4, Issue
Pages -
Publication date: 2013